Abstract

We propose an Active Learning approach to training a segmentation classifier that exploits geometric priors to streamline the annotation process in 3D image volumes. To this end, we use these priors not only to select the voxels most in need of annotation but also to guarantee that they lie on a 2D planar patch, which makes them much easier to annotate than if they were randomly distributed in the volume. A simplified version of this approach is effective in natural 2D images.

We evaluated our approach on Electron Microscopy and 
Magnetic Resonance image volumes, as well as on natural 
images. Comparing our approach against several accepted 
baselines demonstrates a marked performance increase. 

1. Introduction 

Machine Learning techniques are a key component of modern approaches to segmentation, making the need for sufficient amounts of training data critical. As far as images of everyday scenes are concerned, this is addressed by compiling ever larger training databases and obtaining the ground truth via crowd-sourcing [18, 17]. By contrast, in specialized domains such as biomedical image processing, this is not always an option, both because the images can only be acquired using very sophisticated instruments and because only experts, whose time is scarce and precious, can annotate them reliably.

Active Learning (AL) is an established way to reduce this labeling workload by automatically deciding which parts of the image an annotator should label to train the system as quickly as possible and with minimal amounts of manual intervention. However, most AL techniques used in Computer Vision, such as [14, 13, 34, 21], are inspired by earlier methods developed primarily for general tasks or Natural Language Processing [32, 15]. As such, they rarely account for the specific difficulties or exploit the opportunities that arise when annotating individual pixels in 2D images and 3D voxels in image volumes.

More specifically, 3D stacks such as those depicted by Fig. 1 are common in the biomedical field and are particularly challenging, in part because it is difficult both to develop effective interfaces to visualize the huge image data and for users to quickly figure out what they are looking at. In this paper, we will therefore focus on image volumes, but the techniques we will discuss are nevertheless also applicable to regular 2D images by treating them as stacks of height one.

We therefore introduce a novel approach to AL that is geared towards segmenting 3D image volumes and is also applicable to ordinary 2D images. By design, it takes into account the geometric constraints that regions should obey and makes the annotation process convenient. Our contribution is hence twofold:

• We introduce a way to exploit geometric priors to more 
effectively select the image data the expert user is 
asked to annotate. 

• We streamline the annotation process in 3D volumes 
so that annotating them is no more cumbersome than 
annotating ordinary 2D images, as depicted by Fig. 2. 

In the remainder of this paper, we first review current approaches to AL and discuss why they are not necessarily the most effective when dealing with pixels and voxels. We then give a short overview of our approach and discuss in more detail how we use geometric priors and simplify the annotation process. Finally, we compare our results against those of accepted baselines and state-of-the-art techniques.

2. Related Work and Motivation 

In this paper, we are concerned with situations where domain experts are available to annotate images. However, their time is limited and expensive. We would therefore like to exploit it as effectively as possible. In such a scenario, AL [27] is a technique of choice as it tries to determine the smallest possible set of training samples to annotate for effective model instantiation.

In practice, almost any classification scheme can be incorporated into an AL framework. For image processing purposes, that includes SVMs [13], Conditional Random Fields [34], Gaussian Processes [14] and Random Forests [21]. Typical strategies for query selection rely on uncertainty sampling [13], query-by-committee [7, 12], expected model change [29, 31, 34], or measuring information in the Fisher matrix [11].

Figure 1. Interface of the FIJI Visualization API [25], which is extensively used to interact with 3D image stacks. The user is presented with three orthogonal planar slices of the stack. While effective when working slice by slice, this is extremely cumbersome for random access to voxels anywhere in the 3D stack, which is what a naive AL implementation would require.

Figure 2. Our approach to annotation. The system selects an optimal plane in an arbitrary orientation, as opposed to only xy, xz, and yz, and presents the user with a patch that is easy to annotate. (a) The annotated area shown as part of the full 3D stack. (b) The planar patch the user would see. It could be annotated by clicking twice to specify the red segment that forms the boundary between the inside and outside of a target object within the green circle. Best viewed in color.

These techniques have been used for tasks such as Natural Language Processing [15, 32, 24], Image Classification [13, 11], and Semantic Segmentation [34, 12]. However, selection strategies are rarely designed to take advantage of image specificities when labeling individual pixels or voxels, such as the fact that neighboring ones tend to have the same labels or that boundaries between similarly labeled ones are often smooth. The segmentation methods presented in [16, 12] do take such geometric constraints into account at the classifier level, but not in AL query selection as we do.

Similarly, batch-mode selection [28, 11, 29, 5] has become a standard way to increase efficiency by asking the expert to annotate more than one sample at a time [23, 2]. But again, this has been mostly investigated in terms of semantic queries without due consideration of the fact that, in images, it is much easier for annotators to quickly label many samples in a localized image patch than to annotate random image locations. In 3D image volumes [16, 12, 8], it is even more important to provide the annotator with a patch in a well-defined plane, such as the one shown in Fig. 2, rather than having them move randomly in a complicated 3D volume, which is extremely cumbersome using current 3D image display tools such as the popular FIJI platform depicted by Fig. 1. The technique of [33] is an exception in that it asks users to label objects of interest in a plane of maximum uncertainty. Our approach is similar, but it also incorporates geometric constraints in query selection and, as we show in the results section, it outperforms the earlier method.

3. Approach 

We begin by broadly outlining our framework, which is set in a traditional AL context. That is, we wish to train a classifier for segmentation purposes, but initially have only a few labeled and many unlabeled training samples at our disposal.

Since segmentation of 3D volumes is computationally expensive, supervoxels have been extensively used to speed up the process [3, 20]. We therefore formulate our problem in terms of classifying supervoxels as either part of a target object or not. As such, we start by oversegmenting the image using the SLIC algorithm [1] and computing for each resulting supervoxel $s_i$ a feature vector $\mathbf{x}_i$. Note that SLIC superpixels/supervoxels are always roughly circular/spherical, which allows us to characterize them by their center and radius.

Our AL problem thus involves iteratively finding the next 
set of supervoxels that should be labeled by an expert to 
improve segmentation performance as quickly as possible. 
To this end, our algorithm proceeds as follows: 

1. Using the already manually labeled supervoxels $S_L$, we train a task-specific classifier and use it to predict for all remaining supervoxels $S_U$ the probability of being foreground or background.

2. Next, we score unlabeled supervoxels on the basis of a novel uncertainty function that combines traditional Feature Uncertainty with Geometric Uncertainty. Estimating the former usually involves feeding the features attributed to each supervoxel to the previously trained classifier. To compute the latter, we look at the uncertainty of the label that can be inferred based on a supervoxel's distance to its neighbors and their predicted labels. By doing so, we effectively capture the constraints imposed by the local smoothness of the image data.

3. We then automatically select the best plane through the 3D image volume in which to label additional samples, as depicted in Fig. 2. The expert can then effortlessly label the supervoxels from a circle in the selected plane by defining a line separating target from non-target regions. This removes the need to examine the relevant image data from multiple perspectives, as depicted in Fig. 1, and simplifies the labeling task.

The process is then repeated. In Sec. 4, we will discuss the second step and in Sec. 5 the third. In Sec. 6, we will demonstrate that this pipeline yields faster learning rates than competing approaches.
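The three steps above can be sketched as a generic loop. This is a minimal illustration rather than our implementation: the function name `active_learning_loop`, its callable arguments, and the one-dimensional toy problem are all hypothetical and only serve to show how training, scoring, and querying alternate.

```python
import math
from typing import Callable, Dict, Hashable, Iterable, List


def active_learning_loop(
    pool: Iterable[Hashable],
    initial_labels: Dict[Hashable, int],
    train: Callable[[Dict[Hashable, int]], Callable[[Hashable], float]],
    uncertainty: Callable[[float], float],
    select_batch: Callable[[Dict[Hashable, float]], List[Hashable]],
    oracle: Callable[[Hashable], int],
    n_rounds: int,
) -> Dict[Hashable, int]:
    """Generic AL loop: train, score the unlabeled samples, query, repeat."""
    labels = dict(initial_labels)
    unlabeled = set(pool) - set(labels)
    for _ in range(n_rounds):
        if not unlabeled:
            break
        predict = train(labels)  # step 1: fit a classifier on the labeled set
        scores = {s: uncertainty(predict(s)) for s in unlabeled}  # step 2
        for s in select_batch(scores):  # step 3: samples shown to the expert
            labels[s] = oracle(s)
            unlabeled.discard(s)
    return labels


# Toy usage (hypothetical): samples are the integers 0..9, foreground iff >= 5.
def toy_train(labels: Dict[Hashable, int]) -> Callable[[Hashable], float]:
    pos = [s for s, y in labels.items() if y == 1]
    neg = [s for s, y in labels.items() if y == 0]
    boundary = (min(pos) + max(neg)) / 2.0  # midpoint between the two classes
    return lambda s: 1.0 / (1.0 + math.exp(-(s - boundary)))


result = active_learning_loop(
    pool=range(10),
    initial_labels={0: 0, 9: 1},
    train=toy_train,
    uncertainty=lambda p: 1.0 - abs(2.0 * p - 1.0),  # peaks at p = 0.5
    select_batch=lambda scores: [max(scores, key=scores.get)],
    oracle=lambda s: int(s >= 5),
    n_rounds=3,
)
```

In this toy setting each round queries a single sample; the plane-based selection of Sec. 5 would correspond to a `select_batch` that returns several spatially coherent samples at once.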

4. Geometry-Based Active Learning 

Most AL methods were developed for general tasks and operate exclusively in feature space, thus ignoring the geometric properties of images and more specifically their geometric consistency. To remedy this, we introduce the concept of Geometric Uncertainty and then show how to combine it with more traditional Feature Uncertainty.

Our basic insight is that supervoxels that are assigned a label other than that of their neighbors ought to be considered more carefully than those that are assigned the same labels. In other words, under the assumption that neighbors more often than not have identical labels, the chance of the assignment being wrong is higher for such supervoxels. This is what we refer to as Geometric Uncertainty and we now formalize it.

4.1. Feature Uncertainty 

For each supervoxel $s_i$ and each class $y$, let $p_\theta(y_i = y \mid \mathbf{x}_i)$ be the probability that its class $y_i$ is $y$, given the corresponding feature vector $\mathbf{x}_i$. In this work, we will assume that this probability can be computed by means of a classifier which has been trained using parameters $\theta$ and we take $y$ to be 1 if the supervoxel belongs to the foreground and 0 otherwise.¹ In many AL algorithms, the uncertainty of this prediction is taken to be the Shannon entropy

$$H_i^{\theta} = -\sum_{y \in \{0,1\}} p_\theta(y_i = y \mid \mathbf{x}_i) \log p_\theta(y_i = y \mid \mathbf{x}_i). \tag{1}$$

We will refer to this uncertainty estimate as the Feature Uncertainty.
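As a minimal sketch of Eq. (1) in the binary case, the entropy can be computed directly from the predicted foreground probability. The function name `binary_entropy` and the use of the natural logarithm are assumptions made for illustration; the base of the logarithm only rescales the score.

```python
import math


def binary_entropy(p_foreground: float) -> float:
    """Shannon entropy (in nats) of a foreground/background prediction."""
    h = 0.0
    for p in (p_foreground, 1.0 - p_foreground):
        if p > 0.0:  # by convention, 0 * log(0) contributes nothing
            h -= p * math.log(p)
    return h
```

The score is largest for a prediction of 0.5 and vanishes for fully confident predictions, which is why uncertainty sampling favours supervoxels near the decision boundary.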

¹ Extensions to multiclass cases can be similarly derived.



Figure 3. Image represented as a graph: we treat supervoxels as nodes in the graph, and the edge weights between them reflect the probability of transition of the same label to a neighbour. Supervoxel $s_i$ has $k$ neighbours $A_k(s_i) = \{s_1^i, s_2^i, \ldots, s_k^i\}$; $p_T(y_i = y \mid y_j = y)$ is the probability of node $s_i$ having the same label as node $s_j^i$; and $p_\theta(y_i = y \mid \mathbf{x}_i)$ is the probability that $y_i$, the class of $s_i$, is $y$, given only the corresponding feature vector $\mathbf{x}_i$.

4.2. Geometric Uncertainty 

Note that estimating the uncertainty as described above completely ignores the correlations between neighboring supervoxels. To account for them, we can estimate the entropy of a different probability, specifically the probability that supervoxel $s_i$ belongs to class $y$ given the classifier predictions of its neighbors, which we denote $p_G(y_i = y)$.

To this end, we treat the supervoxels of a single image volume as nodes of a weighted graph $G$ whose edges connect neighboring ones, as depicted in Fig. 3. We let $A_k(s_i) = \{s_1^i, s_2^i, \ldots, s_k^i\}$ be the set of $k$ nearest neighbors of $s_i$ and assign to each of the edges a weight inversely proportional to the Euclidean distance between the supervoxel centers. For each node $s_i$, we normalize the weights of all incoming edges so that their sum is one and treat each normalized weight as the probability $p_T(y_i = y \mid y_j = y)$ of node $s_i$ having the same label as node $s_j^i \in A_k(s_i)$. In other words, the closer two nodes are, the more likely they are to have the same label.
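The graph construction just described can be sketched as follows. This is an illustrative brute-force version under stated assumptions: the name `knn_transition_probabilities` is hypothetical, supervoxel centers are assumed to be distinct, and the quadratic neighbor search would be replaced by a spatial index for real volumes.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def knn_transition_probabilities(
    centers: Sequence[Point], k: int
) -> List[Dict[int, float]]:
    """For each node i, map each of its k nearest neighbors j to p_T."""
    transitions: List[Dict[int, float]] = []
    for i, c_i in enumerate(centers):
        # Distances from node i to every other node, closest first.
        nearest = sorted(
            (math.dist(c_i, c_j), j) for j, c_j in enumerate(centers) if j != i
        )[:k]
        # Weight inversely proportional to distance (assumes distance > 0).
        weights = {j: 1.0 / d for d, j in nearest}
        total = sum(weights.values())
        # Normalize so that the weights of node i sum to one.
        transitions.append({j: w / total for j, w in weights.items()})
    return transitions
```

For three collinear centers at positions 0, 1, and 3, the first node assigns probability 0.75 to its neighbor at distance 1 and 0.25 to the one at distance 3.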

To infer $p_G(y_i = y)$, we then use the well-studied Random Walk strategy [19], as it reflects well our smoothness assumption and has been extensively used for image segmentation purposes [9, 33]. Given the $p_T(y_i = y \mid y_j = y)$ transition probabilities, we can compute the probabilities $p_G$ iteratively by initially taking $p_G^{0}(y_i = y)$ to be

$$p_G^{0}(y_i = y) = \sum_{s_j \in A_k(s_i)} p_T(y_i = y \mid y_j = y)\, p_\theta(y_j = y \mid \mathbf{x}_j), \tag{2}$$

and then iteratively computing

$$p_G^{\tau+1}(y_i = y) = \sum_{s_j \in A_k(s_i)} p_T(y_i = y \mid y_j = y)\, p_G^{\tau}(y_j = y). \tag{3}$$

The procedure describes the propagation of labels to a supervoxel from its neighborhood. The number of iterations $\tau_{\max}$ defines the radius of the neighborhood involved in the computation of $p_G$ for $s_i$ and encodes the smoothness priors.

Given these probabilities, we can now take the Geometric Uncertainty to be

$$H_i^{G} = -\sum_{y \in \{0,1\}} p_G(y_i = y) \log p_G(y_i = y), \tag{4}$$

as we did in Sec. 4.1 to estimate the Feature Uncertainty.
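The propagation of Eqs. (2) and (3) followed by the entropy of Eq. (4) can be sketched as below. The function names and the data layout (one dictionary of normalized neighbor weights per node) are assumptions for illustration. Only the foreground probability is tracked: because the weights of each node sum to one, the background probability is its complement.

```python
import math
from typing import Dict, List, Sequence


def propagate_probabilities(
    transitions: Sequence[Dict[int, float]],
    p_feature: Sequence[float],
    tau_max: int,
) -> List[float]:
    """Foreground probability p_G after tau_max random-walk iterations."""
    # Eq. (2): initialize from the classifier predictions of the neighbors.
    p = [sum(w * p_feature[j] for j, w in nbrs.items()) for nbrs in transitions]
    # Eq. (3): repeatedly average the current estimates of the neighbors.
    for _ in range(tau_max):
        p = [sum(w * p[j] for j, w in nbrs.items()) for nbrs in transitions]
    return p


def geometric_uncertainty(p_geometric: Sequence[float]) -> List[float]:
    """Eq. (4): binary entropy (in nats) of each propagated probability."""
    entropies = []
    for p_fg in p_geometric:
        h = 0.0
        for p in (p_fg, 1.0 - p_fg):
            if p > 0.0:  # 0 * log(0) is taken to be 0
                h -= p * math.log(p)
        entropies.append(h)
    return entropies
```

A supervoxel surrounded by confidently labeled neighbors of one class receives a low score, whereas one whose neighbors disagree receives a high score even if its own classifier prediction is confident.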

4.3. Combining Feature and Geometric Entropy 

As discussed above, from a trained classifier we can thus estimate the Feature and Geometric Uncertainties. To use them jointly, we should in theory estimate the joint probability distribution $p_{\theta,G}(y_i = y \mid \mathbf{x}_i)$ and the corresponding joint entropy. As this is computationally intractable in our model, we take advantage of the fact that the joint entropy is upper bounded by the sum of the individual entropies $H_i^{\theta}$ and $H_i^{G}$. Thus, for each supervoxel, we take the Combined Uncertainty to be

$$H_i^{\theta,G} = H_i^{\theta} + H_i^{G}, \tag{5}$$

that is, the upper bound of the joint entropy.

In practice, using this measure means that supervoxels that individually receive uncertain predictions and are in areas of transition between foreground and background will be considered first.
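For completeness, the bound used above follows from a standard identity. In the sketch below, $A$ and $B$ are generic discrete random variables introduced only for illustration; they stand for the label as described by the feature-based and the geometry-based distributions, and $I(A; B)$ denotes their mutual information.

```latex
\begin{align*}
H(A, B) &= H(A) + H(B) - I(A; B) \\
        &\le H(A) + H(B), \qquad \text{since } I(A; B) \ge 0 .
\end{align*}
```

Equality holds only when $A$ and $B$ are independent, so the sum in Eq. (5) is generally a loose but inexpensive surrogate for the joint entropy.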

5. Batch-Mode Geometry Query Selection 

The simplest way to exploit the Combined Uncertainty introduced in Sec. 4.3 would be to pick the most uncertain supervoxel, ask the expert to label it, retrain the classifier, and iterate. A more effective way, however, is to find appropriately-sized batches of uncertain supervoxels and ask the expert to label them all before retraining the classifier. As discussed in Sec. 2, this is referred to as batch-mode selection. A naive implementation of this would force the user to randomly view and annotate supervoxels in the volume regardless of where they are, which would be extremely cumbersome.

In this section, we therefore introduce an approach to using the uncertainty measure to first select a planar patch in 3D volumes and then to allow the user to quickly label positives and negatives within it, as shown in Fig. 2.

Since we are working in 3D and there is no preferential orientation in the data to work with, it makes sense to look for spherical regions where the uncertainty is maximal. However, for practical reasons, we only want the annotator to consider circular regions within planar patches such as the ones depicted in Fig. 2 and Fig. 4. These can be understood as the intersection of the sphere with a plane of arbitrary orientation.

Formally, let us consider a supervoxel $s_i$ and let $\mathcal{P}_i$ be the set of planes bisecting the image volume and containing the center of $s_i$. Each plane $p_i \in \mathcal{P}_i$ can be parameterized by two angles $\phi \in (0, \pi)$ and $\gamma \in (0, \pi)$, and its origin can be taken to be the center of $s_i$, as depicted by Fig. 4. In addition, let $C_i^r(p_i)$ be the set of supervoxels lying on $p_i$ within distance $r \geq 0$ of its origin, shown in green in Fig. 2.

Figure 4. Coordinate system for defining planes. Plane $p_i$ (yellow) is defined by two angles: $\phi$, the angle of intersection between plane $p_i$ and plane $X s_i Y$ (blue), and $\gamma$, the angle of intersection between plane $p_i$ and plane $Y s_i Z$ (red). Best seen in color.

Recall from Sec. 4 that we have defined the uncertainty of a supervoxel as either the Feature Uncertainty of Eq. (1), the Geometric Uncertainty of Eq. (4), or the Combined Uncertainty of Eq. (5). In other words, we can associate each $s_i$ with an uncertainty value $u_i \geq 0$ in one of three ways. Whichever way we choose, finding the circle of maximal uncertainty can be formulated as finding

$$p_i^{*} = \arg\max_{p_i \in \mathcal{P}_i} \sum_{s_j \in C_i^r(p_i)} u_j. \tag{6}$$

In practice, we find the most uncertain plane $p_i^{*}$ for each of the $t$ most uncertain supervoxels $s_i$ and present the overall most uncertain plane $p^{*}$ to the annotator. Since $u_i \geq 0$ and Eq. (6) is linear in $u_i$, we designed a branch-and-bound approach to solving Eq. (6). It uses a bounding function to quickly eliminate entire parts of the parameter space until it is reduced to a singleton. By contrast to an exhaustive search, which would be excruciatingly slow, our current MATLAB implementation takes 0.12s per plane selection on the 10 images of resolution 176 × 170 × 220 of the MRI dataset of Sec. 6.3. This suggests that a C implementation could run in real time, which is critical for such an interactive method to be accepted by users. We discuss our implementation in more detail in the supplementary material.
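To make the objective of Eq. (6) concrete, the sketch below evaluates it by exhaustive search over a coarse grid of orientations. It is deliberately not the branch-and-bound solver described above, and several choices are assumptions made for illustration: the function name, the grid resolution `n_angles`, the parameterization of a plane by its unit normal in spherical angles (which need not coincide with the angles of Fig. 4), and the rule that a supervoxel lies on the plane when the plane passes within its radius of its center.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def most_uncertain_plane(
    origin: Point,
    centers: Sequence[Point],
    radii: Sequence[float],
    scores: Sequence[float],
    r: float,
    n_angles: int = 12,
) -> Tuple[Tuple[float, float], float]:
    """Grid search for the plane through `origin` with maximal summed score."""
    best_angles, best_total = (0.0, 0.0), -1.0  # scores are non-negative
    for a in range(n_angles):
        phi = math.pi * a / n_angles
        for b in range(n_angles):
            gamma = math.pi * b / n_angles
            # Unit normal of the candidate plane (assumed parameterization).
            normal = (
                math.sin(phi) * math.cos(gamma),
                math.sin(phi) * math.sin(gamma),
                math.cos(phi),
            )
            total = 0.0
            for center, radius, u in zip(centers, radii, scores):
                offset = [c - o for c, o in zip(center, origin)]
                if math.sqrt(sum(d * d for d in offset)) > r:
                    continue  # outside the sphere of radius r
                if abs(sum(d * n for d, n in zip(offset, normal))) <= radius:
                    total += u  # the plane cuts through this supervoxel
            if total > best_total:
                best_angles, best_total = (phi, gamma), total
    return best_angles, best_total
```

The cost of this search grows with the square of the angular resolution times the number of supervoxels, which illustrates why pruning the parameter space matters for interactive use.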

Note that when the radius $r = 0$, this reduces to single-supervoxel labeling. By contrast, for $r > 0$, this allows annotation of many uncertain supervoxels with a few mouse clicks, as will be discussed further in Sec. 6. Although planar selection can be applied to any type of uncertainty value, we believe that it is most beneficial when combined with Geometric Uncertainty, as the latter already takes into account the most uncertain regions instead of isolated supervoxels.

6. Experiments 

In this section, we evaluate our full approach on two different Electron Microscopy (EM) datasets and on a Magnetic Resonance Imaging (MRI) one. We then demonstrate that a simplified version is effective for natural 2D images.

6.1. Setup and Parameters 

For all our experiments, we used Boosted Trees selected by Gradient Boosting [30, 4] as our underlying classifier. Given that during early AL iterations only limited amounts of training data are available, we limit the depth of our trees to 2 to avoid over-fitting. Following standard practice, individual trees are optimized using 40% to 60% of the available training data chosen at random, and 10 to 40 features are explored per split. We set the number $k$ of nearest neighbors of Section 4.2 to be the average number of immediately adjacent supervoxels, which is between 7 and 15 depending on the resolution of the image and the size of the supervoxels. However, experiments showed that the algorithm is not very sensitive to the choice of this parameter. We restrict the size of each planar patch to be small enough to typically contain no more than part of one object of interest. To this end, we take the radius $r$ of Section 5 to be between 10 and 15, which yields patches such as those depicted by Fig. 5.

Baselines. For each dataset, we compare our approach against several baselines. The simplest is Random Sampling (Rs), that is, randomly selecting samples to be labeled. It serves to gauge the difficulty of the segmentation problem and quantify the improvement brought by the more elaborate strategies.

The next simplest, but widely accepted, approach is to perform Uncertainty Sampling [5, 18] of supervoxels by using the uncertainty measures of Section 4. Let $u_i$ be the uncertainty score we use in a specific experiment. The strategy then is to select

$$s^{*} = \arg\max_{s_i \in S_U} u_i. \tag{7}$$

We will refer to this as FUs when using the Feature Uncertainty of Eq. 1 and as CUs when using the Combined Uncertainty of Eq. 5. For the Random Walk, an iterative procedure with $\tau_{\max} = 20$ leads to high learning rates in our applications. Finally, the most sophisticated approach is to use Batch-Mode Geometry Query Selection, as described in Sec. 5, in conjunction with either Feature Uncertainty or Combined Uncertainty. We will refer to the two resulting strategies as pFUs and pCUs, respectively. Both plane selection strategies use the $t = 5$ most uncertain supervoxels in the optimization. Increasing this value further did not yield a significant increase in the learning rate.

Figs. 2 and 5 jointly depict what a potential user would see for pFUs and pCUs given a small enough patch radius. Given a well-designed interface, the user will typically need to click only once or twice to provide the required feedback (see Fig. 5). In our performance evaluation, we will therefore estimate that each intervention of the user requires two clicks for pFUs and pCUs, whereas it requires only one for Rs, FUs, and CUs. So, for the method comparison, we measure annotation effort as 1 for Rs, FUs, and CUs and as 2 for pFUs and pCUs.

Note that pFUs is similar in spirit to the approach of [33] and can therefore be taken as a good indicator of how this other method would perform on our data. However, unlike [33], we do not require the user to label the whole plane and keep our suggested interface for a fairer comparison.

Adaptive Thresholding. Recall from Section 4.1 that for all the approaches discussed here, the probability $p_\theta(y_i = y \mid \mathbf{x}_i)$ of a supervoxel being foreground is computed from $F$, the output of the classifier, and $h$, the threshold [10]. Usually, the threshold is chosen by cross-validation, but this strategy may be misleading or not even possible for AL. We therefore assume that the scores of training samples in each class are Gaussian distributed with unknown parameters $\mu$ and $\sigma$. We then find an optimal threshold $h^{*}$ by fitting Gaussian distributions to the scores of the positive and negative classes and choosing the value that yields the smallest Bayesian error, as depicted by Fig. 6(a). We refer to this approach as Adaptive Thresholding and we use it for all our experiments. Fig. 6(b) depicts the value of the selected threshold as increasing amounts of annotated data become available. Note that different strategies yield varying convergence speeds and that the plane-based strategies (pFUs and pCUs) converge fastest to a stable value.
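A minimal sketch of the threshold selection is given below. It rests on assumptions that are not stated above and are made only for illustration: equal class priors, positives scoring higher than negatives on average, at least two scores per class, and a simple grid search between the two class means. The function names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def gaussian_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def adaptive_threshold(
    pos_scores: Sequence[float], neg_scores: Sequence[float], n_grid: int = 1001
) -> float:
    """Threshold between the class means that minimizes the Bayes error."""
    mu_p, sd_p = mean(pos_scores), stdev(pos_scores)
    mu_n, sd_n = mean(neg_scores), stdev(neg_scores)
    lo, hi = min(mu_n, mu_p), max(mu_n, mu_p)
    best_h, best_err = lo, float("inf")
    for step in range(n_grid):
        h = lo + (hi - lo) * step / (n_grid - 1)
        # Negatives above h are false positives; positives below h are misses.
        err = 0.5 * (1.0 - gaussian_cdf(h, mu_n, sd_n)) + 0.5 * gaussian_cdf(
            h, mu_p, sd_p
        )
        if err < best_err:
            best_h, best_err = h, err
    return best_h
```

When both classes have the same spread, the selected threshold is the midpoint of the two means; unequal spreads shift it towards the narrower class.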

Experimental Protocol. In all cases, we start with 5 positive and 5 negative labeled supervoxels and perform AL iterations until we receive 100 inputs from the user. Each method starts with the same random subset of samples and each experiment is repeated $N = 40$ times. We will therefore plot not only accuracy results but also indicate the variance of these results.

Fully annotated ground-truth volumes are available to us and we use them to simulate the expert's intervention in our experiments. We detail the specific features we used for EM, MRI, and natural images below.




Figure 6. (a) We estimate the mean and standard deviation of the classifier scores of positive-class datapoints (red), $\mu^{+}$ and $\sigma^{+}$, and of negative-class datapoints (blue), $\mu^{-}$ and $\sigma^{-}$, and fit two Gaussian distributions. Given their pdfs, we estimate the threshold $h^{*}$ that yields the optimal Bayesian error. (b) Adaptive Thresholding convergence rate of the classifier threshold for different AL strategies.

6.2. Results on EM data 

Here, we work with two 3D Electron Microscopy stacks of rat neural tissue, one from the striatum and the other from the hippocampus. One stack of size 318 × 711 × 422 (165 × 1024 × 653 for the hippocampus) is used for training and another stack of size 318 × 711 × 450 (165 × 1024 × 883) is used to evaluate the performance. Their resolution is 5nm in all three spatial orientations. The slices of Fig. 1 as well as the patches in the upper row of Fig. 5(a) come from the striatum; the hippocampus volume is shown in Fig. 8(a), with its patches shown in the lower row of Fig. 5(a).

The task is to segment mitochondria, which are the intracellular structures that supply the cell with its energy and are of great interest to neuroscientists. It is extremely laborious to annotate sufficient amounts of training data for learning segmentation algorithms to work satisfactorily. Furthermore, different brain areas have different characteristics, which means that the task must be repeated often. The features we feed our Boosted Trees rely on local texture and shape information using ray descriptors and intensity histograms as in [20].


Dataset        FUs      CUs      pFUs     pCUs
Hippocampus    0.1172   0.1009   0.0848   0.0698
Striatum       0.1326   0.1053   0.1133   0.0904
MRI            0.0758   0.0642   0.0767   0.0545
Natural        0.1448   0.1389   0.1494   0.1240

Table 1. Variability of results by different AL strategies. 80% of the scores lie within the indicated interval. Feature Uncertainty is always more variable than Combined Uncertainty, and batch selection is less variable than single-instance selection in most cases. The best (lowest) result in each row is obtained by pCUs.


In Fig. 7, we plot the performance of all the approaches we consider in terms of the VOC [6] score, a commonly used measure for this kind of application, as a function of the annotation effort. The horizontal line at the top depicts the VOC scores obtained by using the whole training set, which comprises 276130 and 325880 supervoxels for the striatum and the hippocampus, respectively. FUs provides a boost over Rs, and CUs yields a larger one. In both cases, a further improvement is obtained by introducing the batch-mode geometry query selection of pFUs and pCUs, with the latter coming out on top. Recall that these numbers are averages over many runs. In Table 1, we give the corresponding variances. Note that both using the Geometric Uncertainty and the batch-mode tend to reduce them, thus making the process more predictable. Note also that the number of user inputs we use, 100, is far smaller than the total number of available samples.
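For reference, the VOC score of a binary segmentation is the ratio of true positives to the sum of true positives, false positives, and false negatives, that is, the intersection over union of the predicted and ground-truth foreground. The sketch below assumes flat sequences of 0/1 labels; the function name and the convention of returning 1.0 when both foregrounds are empty are choices made for illustration.

```python
from typing import Sequence


def voc_score(predicted: Sequence[int], ground_truth: Sequence[int]) -> float:
    """Intersection over union of the foreground class: TP / (TP + FP + FN)."""
    tp = sum(1 for p, g in zip(predicted, ground_truth) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(predicted, ground_truth) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(predicted, ground_truth) if p == 0 and g == 1)
    denominator = tp + fp + fn
    # Convention assumed here: two empty foregrounds agree perfectly.
    return tp / denominator if denominator else 1.0
```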

Somewhat surprisingly, in the hippocampus case, the classifier performance given only 100 training data points is higher than the one obtained by using all the training data. In fact, this phenomenon has been reported in the AL literature [26] and suggests that a well-chosen subset of datapoints can produce better generalisation performance than the complete set.

6.3. Results on MRI data 

Here we consider multimodal brain tumor segmentation in MRI brain scans. Segmentation quality depends critically on the amount of training data and only highly-trained experts can provide it. T1, T2, FLAIR, and post-Gadolinium T1 MR images are available in the BRATS dataset for each one of 20 subjects [22]. We use standard filters such as Gaussian, gradient filter, tensor, Laplacian of Gaussian and Hessian with different parameters to compute the feature vectors we feed to our Boosted Trees.
Figure 8. Examples of 3D datasets. (a) Hippocampus volume for mitochondria segmentation. (b) MRI data for tumor segmentation (FLAIR image).

In Fig. 9, we plot the performance of all the approaches we consider in terms of the dice score [8], a commonly used quality measure for brain tumor segmentation, as a function of the annotation effort and in Table 1, we give the corresponding variances. We observe the same pattern as in Fig. 7, with pCUs again doing best.
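For reference, the dice score of a binary segmentation is twice the number of true positives divided by the sum of the predicted and ground-truth foreground sizes. The sketch below assumes flat sequences of 0/1 labels; the function name and the value returned when both foregrounds are empty are conventions chosen for illustration.

```python
from typing import Sequence


def dice_score(predicted: Sequence[int], ground_truth: Sequence[int]) -> float:
    """Dice coefficient of the foreground class: 2 TP / (2 TP + FP + FN)."""
    tp = sum(1 for p, g in zip(predicted, ground_truth) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(predicted, ground_truth) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(predicted, ground_truth) if p == 0 and g == 1)
    denominator = 2 * tp + fp + fn
    # Convention assumed here: two empty foregrounds agree perfectly.
    return 2 * tp / denominator if denominator else 1.0
```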

The patch radius parameter $r$ of Sec. 5 plays an important role in plane selection. To evaluate its influence, we recomputed our pCUs results 50 times using three different values of $r$: 10, 15, and 20. The resulting plots are shown in Fig. 9. With a larger radius, the learning rate is slightly higher, as could be expected since more voxels are labeled each time. However, as the patches become larger, it stops being clear that this can be done with only two mouse clicks, and that is why we limited ourselves to radius sizes of 10 to 15.

Figure 10. Comparison of various AL strategies for segmentation of natural images.

6.4. Natural Images 

Finally, we turn to natural 2D images and replace supervoxels by superpixels. In this case, the plane selection of pFUs and pCUs reduces to simple selection of image patches in the image. In practice, we simply select superpixels with their 4 neighbors. Increasing this number would lead to higher learning rates in the same way as increasing the patch radius $r$, but we restrict it to a small value to ensure labelling can be done with 2 mouse clicks on average. To compute image features, we use Gaussian, Laplacian, Laplacian of Gaussian, Prewitt, and Sobel filters to filter intensity and color values, gather first-order statistics such as local standard deviation, local range, gradient magnitude and direction histograms, as well as SIFT features.

We plot our results on the Weizmann horse database in Fig. 10 and give the corresponding variances in Table 1. The pattern is again similar to the one observed in Figs. 7 and 9, with the difference between CUs and pCUs being smaller due to the fact that the 2D batch-mode approach is much less sophisticated than the 3D one. Note that the first few iterations are disastrous for all methods; however, plane-based methods are able to recover from this quite fast.

7. Conclusion 

In this paper we introduced an approach to exploiting the geometric priors inherent to images to increase the effectiveness of Active Learning for segmentation purposes. For 2D images, it relies on an approach to Uncertainty Sampling that accounts not only for the uncertainty of the prediction at a specific location but also in its neighborhood. For 3D image stacks, it adds to this the ability to automatically select a planar patch in which manual annotation is easy to do.

We have formulated our algorithms in terms of background/foreground segmentation but the entropy functions that we use to express our uncertainties can handle multiple classes with little change to the overall approach. In future work, we will therefore extend our approach to more general segmentation problems.
Figure 5. Circular patches to be annotated by the expert, highlighted by the yellow circle, in (a) Electron Microscopy data, (b) MRI data, and (c) natural images. The patches can be entirely foreground or entirely background. Alternatively, the boundary between foreground and background within the patch can be indicated by tracing a red line segment. In all cases, that would require at most two mouse clicks.



Figure 7. Comparison of various AL strategies for mitochondria segmentation. Left: striatum dataset, right: hippocampus dataset. 
Figure 9. Comparison of various AL strategies for MRI data for tumor segmentation. Left: dice score for BRATS2012 dataset, right: pCUs strategy with patches of different radius.